Method for processing and accessing data objects, particularly documents, and system therefor

ABSTRACT

A method for processing and accessing data objects including a step (II) of gathering collected data objects (D) into groups of objects (DP1, DP2, DPM) associated with respective geographical areas (P1, P2, PM), a step (III) of classifying each object according to several categories of objects (DB1, DR1, DN1), a first step (IV) of classifying the objects into at least one of the object categories (DB1) according to an index common to all the object groups (DP1, DP2, DPM), and a second step (V) of classifying a part of the collected objects forming a particular group of objects (CR1, CR2, CRM) according to a second so-called hierarchical classification scheme (PH) common to all the object groups (DP1, DP2, DPM). The method further includes a step of navigating through selected geographical areas via either of the first and second classification schemes (PT, PH). The method is particularly useful for distributing and accessing documents on medical and pharmaceutical regulations.

DESCRIPTION

The present invention relates to a method for processing and accessing information objects, especially documents. It also envisages a system for implementing this method.

The considerable growth in information exchange in all fields of human activity, especially in the economic field, has been made possible by the development of methods and systems for processing and communicating information. Mention may be made here of the development of databases which may be easily accessible on-line via telecommunications networks. Another, off-line, database consultation mode which is currently being widely developed consists in using permanent information media, especially CD-ROM (Compact-Disc Read-Only Memory) or CD-I (Interactive Compact Disc) discs, which are consulted by persons using appropriate disc drives associated with microcomputers. This consultation mode is meeting with increasing success due to the very high storage capacities of the CD-ROM discs and the possibility of locally processing and storing the information consulted. The CD-ROM discs produced and distributed by the operator or the manager of the database generally include software consultation modules, in addition to the consultable data, while the microcomputer to which the CD-ROM drive is connected is equipped with resident search and processing software.

The objects now stored on CD-ROM discs are very diverse, since, in multimedia applications, these objects include not only computer data but also graphics objects, sound elements and even video or animation sequences. The consultation software associated with CD-ROM drives already offers very extensive facilities for searching and accessing information, and for processing and displaying this information. Hence a user can navigate from one information object to another, depending on his objectives and on his routing within a structured object database. Although the very high storage capacity of a medium of CD-ROM type allows problem-free storage of a large number of document pages, it nevertheless becomes difficult, even for a practised user, to gain access in a reasonable time to an information object being searched for, when the database includes several thousand documents.

An information object is understood to be a set of codes and signs constituting a coherent whole representing data, information of all kinds, text, graphics, image or sound.

A document is understood to be a set of coherent information or data items, organised into pages and relating to a particular topic or subject.

Methods are therefore already known for processing and accessing information objects, especially documents, comprising:

on at least one central site:

stages of collecting objects previously produced and input,

stages of classifying the objects thus collected, and

at least one stage of generating a structured database containing the objects thus classified,

on at least one production site,

stages of writing the structured database thus generated onto permanent information media,

and, on several local consultation sites,

stages of accessing one or more objects searched for within the structured database, these accessing stages comprising stages of searching for the objects and stages of reading these objects from a permanent information medium.

On this subject, mention may be made of methods for processing and for accessing the information contained in encyclopaedias or dictionaries, which employ massive prior information input, classifying of this information, for example in alphabetical order or by topic, writing of this processed and classified information onto CD-ROM discs, distribution of these discs and consultation thereof by users on stations equipped with appropriate disc drive means. Mention may also be made of the case of methods for consulting patent databases on CD-ROM disc.

However, a simple adaptation of these known methods for processing and making available complex information, such as information relating to regulations having to be constantly updated and exhibiting several levels of complexity, is difficult to envisage. It would lead to methods which, in the end, would be difficult to handle and which would be cumbersome to manage, especially for updating, when there was a requirement for processing a large number of objects such as documents relating to the international regulations on pharmaceuticals. Such documents, collected from a very large number of sources, subject to frequent updates and exhibiting a geographical, time-based and topic-based dimension, would therefore prove expensive to process and difficult to access with current methods.

The object of the invention is to remedy these drawbacks by proposing a method for processing and accessing information objects which allows, on the one hand, easy updating of the document database intended to be written onto the permanent information media, and, on the other hand, easy access to objects searched for within the database.

According to the invention, the method for processing and accessing information objects, especially documents, comprising:

on at least one central site:

stages of collecting objects,

stages of classifying the objects thus collected, and

stages of generating a structured database containing the objects thus classified,

on at least one production site,

stages of writing the structured database thus generated onto permanent information media,

and, on several local consultation sites,

stages of accessing one or more objects within the structured database, these accessing stages comprising stages of searching for the objects on a permanent information medium,

is characterised in that each classification stage comprises:

a grouping of the collected objects together into groups of objects associated respectively with given geographical areas,

a categorisation of each object into several types of objects,

a first classification of the objects belonging to at least one of the types of objects, according to a table of contents including a set of topics and common to all the groups of objects, this table of contents being associated with a first classification scheme known as topic-based classification scheme,

a second classification, in each group of objects, of some of the collected objects constituting a particular group of objects, according to a second classification scheme, called hierarchical scheme, common to all the groups of objects,

and in that each stage of accessing an object further comprises a stage of selecting at least one geographical area followed by a stage of navigating within the selected geographical areas, this navigation stage being capable of covering either of the first and second classification schemes.

The technical effects obtained with the method according to the invention are especially the relevance of the texts with regard to the field dealt with, the freshness and comprehensiveness of the information, whether it relates to regulatory texts or practical information, and the ease of searching for a document of the database.

With the method according to the invention, it becomes possible to adapt to the behaviour of a large number of users, to the various frequencies of their requirements and to their variable knowledge of the field.

According to one preferred implementation of the method according to the invention, the method further comprises, within the central site, a stage of selecting key elements within each collected object, these key elements being grouped together within a thesaurus, and the accessing stages comprising stages of selection according to several criteria including a criterion of selection by key element.

The method according to the invention is preferably implemented for processing and accessing documents, and the collected documents are categorised by type into basic documents, reference documents and notes.

According to another aspect of the invention, a system is proposed for processing and accessing objects, for implementing the method according to the invention, comprising:

on at least one central site:

means for collecting objects,

means for classifying the objects thus collected, and

means for generating a structured database containing the objects thus classified,

on at least one production site,

means for writing the structured database thus generated onto permanent information media,

and, on several local consultation sites,

means for accessing one or more objects within the structured database, these accessing means comprising means for searching for the objects on a permanent information medium,

characterised in that the classification means comprise:

means for grouping the collected objects together into groups of objects associated respectively with given geographical areas,

means for categorising each object into several types of objects,

means for carrying out a first classification of the objects belonging to at least one of the types of objects, according to a table of contents including a set of topics and common to all the groups of objects, this table of contents being associated with a first classification scheme known as topic-based classification scheme,

means for carrying out a second classification of some of the collected objects constituting a particular group of objects, according to a second classification scheme called hierarchical scheme, common to all the groups of objects,

and in that the means for accessing an object further comprise means for selecting at least one geographical area and means for navigating within the selected geographical areas, these navigation means being configured so as to cover either of the first and second classification schemes.

Other features and advantages of the invention will emerge further from the description below. In the attached drawings, given by way of non-limiting examples:

FIG. 1, which is divided into FIG. 1A and FIG. 1B, is a block diagram of the processing stages of the method according to the invention, from the collecting of documents to the printing of the permanent information media; and

FIG. 2 is a block diagram of the stages of accessing the documents, in the method according to the invention.

A preferred implementation of a processing and access method in accordance with the invention will now be described, at the same time as an exemplary embodiment of a processing and access system for implementing this method, by reference to the abovementioned FIGS. 1 and 2.

In the practical example of implementation of the processing and access method according to the invention, which will be described in what follows, a document database covering regulations on the registration of pharmaceuticals is to be made available to users, this database being consulted on a microcomputer equipped with means for reading CD-ROM discs. In this example, the objects collected, sorted, classified and searched are documents consisting of texts, tables and possibly graphics.

The stages for producing the document database will firstly be described, with reference to FIG. 1. A first stage I of the method according to the invention consists in collecting documents D, within a central site SC, especially documents drawn up by experts R1, Rk, . . . , RN or communicated by authorised bodies and institutions. These collected documents CO are input and formatted according to a common format. A second stage II consists in grouping together RG the various collected documents into several groups of documents DP1, DP2, . . . , DPM by geographical area, in this instance by country P1, P2, . . . , PM and/or by regional area such as the European Union. The following stage III may be a stage of categorising the documents for each country P1, P2, . . . , PM into several types of documents. By way of example, the following types can be envisaged: basic documents DB1, reference documents DR1 and notes DN1. Moreover, this categorisation into types could very well be carried out before the grouping by country.
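By way of non-limiting illustration, stages II and III may be sketched as follows. This is a minimal sketch only: the `Document` record, the function names and the country codes are assumptions introduced for the example and do not appear in the description.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    area: str      # geographical area, here a hypothetical country code
    doc_type: str  # "basic", "reference" or "note"


def group_by_area(documents):
    """Stage II: gather the collected documents into one group per geographical area."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc.area].append(doc)
    return dict(groups)


def categorise_by_type(group):
    """Stage III: split one geographical group into types of documents."""
    by_type = defaultdict(list)
    for doc in group:
        by_type[doc.doc_type].append(doc)
    return dict(by_type)


# Hypothetical collected documents, already input in a common format.
collected = [
    Document("d1", "FR", "basic"),
    Document("d2", "FR", "note"),
    Document("d3", "US", "basic"),
    Document("d4", "FR", "reference"),
]
groups = group_by_area(collected)
fr_types = categorise_by_type(groups["FR"])
print(sorted(groups))
print([doc.doc_id for doc in fr_types["basic"]])
```

Because the two functions are independent of each other, the categorisation could equally be applied to the whole collection before the grouping by area, as the description allows.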

In a fourth stage IV, the basic documents DB1 are classified according to a table of contents TM which is common to all the countries. The corresponding classification scheme PT is preferably a tree structure of topics including a large number of levels, for example twelve.

The method according to the invention also envisages a stage V of classifying the documents constituting a body of regulations CR1, CR2, . . . , CRM (represented as crosshatched), this classifying being carried out according to a hierarchical scheme PH common to all the countries P1, P2, . . . , PM. This hierarchical scheme PH includes several levels, and may also be structured in tree structure form.
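The use of a single tree of topics shared by all the countries may be illustrated by the following sketch, in which the same topic path addresses the documents of each selected country. The topic names, the `TopicIndex` class and the document identifiers are hypothetical and serve only as an example; the hierarchical scheme PH could be handled by a second tree of the same form.

```python
# Hypothetical table of contents, common to every geographical area.
TABLE_OF_CONTENTS = {
    "Registration": {"Application dossier": {}, "Fees": {}},
    "Labelling": {},
}


def topic_exists(toc, path):
    """Walk the tree level by level and report whether the topic path is defined."""
    node = toc
    for name in path:
        if name not in node:
            return False
        node = node[name]
    return True


class TopicIndex:
    """Classifies documents of each area under topics of one shared tree."""

    def __init__(self, toc):
        self.toc = toc
        self._entries = {}  # (area, topic path) -> list of document ids

    def classify(self, area, path, doc_id):
        path = tuple(path)
        if not topic_exists(self.toc, path):
            raise KeyError(f"unknown topic: {' / '.join(path)}")
        self._entries.setdefault((area, path), []).append(doc_id)

    def documents(self, areas, path):
        """Return, for each selected area, the documents filed under the topic."""
        path = tuple(path)
        return {area: list(self._entries.get((area, path), [])) for area in areas}


index = TopicIndex(TABLE_OF_CONTENTS)
index.classify("FR", ["Registration", "Fees"], "fr-basic-07")
index.classify("US", ["Registration", "Fees"], "us-basic-12")
found = index.documents(["FR", "US", "JP"], ["Registration", "Fees"])
print(found)
```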

Each document may include one or more keywords MC characterising a subject dealt with and which are intended for multi-criterion searches. These keywords, as well as key expressions, are grouped together with their search aliases into a thesaurus TH. Within the documents markers LHi, LHj, LHk are also available, corresponding to hypertext links allowing navigation between different documents of the document database, according to a technique which is well known and widespread nowadays.
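A thesaurus grouping keywords with their search aliases may, purely as an example, be sketched as follows. The `Thesaurus` class, its method names, the case-insensitive matching and the sample terms are assumptions made for this sketch and are not taken from the description.

```python
class Thesaurus:
    """Maps keywords and their aliases to the documents tagged with the keyword."""

    def __init__(self):
        self._canonical = {}  # lower-cased keyword or alias -> canonical keyword
        self._postings = {}   # canonical keyword -> set of document ids

    def add_keyword(self, keyword, aliases=()):
        """Register a keyword and the aliases under which it may be searched."""
        self._postings.setdefault(keyword, set())
        for term in (keyword, *aliases):
            self._canonical[term.lower()] = keyword

    def tag(self, doc_id, keyword):
        """Attach a registered keyword to a document (KeyError if unregistered)."""
        self._postings[keyword].add(doc_id)

    def lookup(self, term):
        """Return the documents tagged with the keyword that `term` designates."""
        keyword = self._canonical.get(term.lower())
        if keyword is None:
            return set()
        return set(self._postings[keyword])


thesaurus = Thesaurus()
thesaurus.add_keyword("marketing authorisation", aliases=["MA", "product licence"])
thesaurus.tag("d1", "marketing authorisation")
thesaurus.tag("d7", "marketing authorisation")
print(sorted(thesaurus.lookup("Product Licence")))
```

A search by alias and a search by the keyword itself return the same set of documents, which is the behaviour expected of a criterion of selection by key element.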

Hence, on completion of the abovementioned processing stages I-VI, a document database BD is available, structured by country P1, P2, . . . , PM, with which a table of contents scheme PT, a hierarchical scheme PH and a thesaurus TH are associated.

It should be noted that the abovementioned processing stages I-VI may be carried out in a chronological order other than that which has just been described. Thus, the documents may be allocated keywords and markers for hypertext links before being grouped together by country. The same is true for their categorisation by type, which can also be carried out before the grouping by country.

On completion of processing stages I-VI, in a printing stage VII carried out on a production site SP, the document database BD and a dedicated application generator LD are written according to a known reproduction process ED onto permanent information media CD, such as CD-ROM discs or any other equivalent medium. These media are then distributed to users.

The various accessing stages performed on a local site SL in the course of the use of a document database BD produced with the method according to the invention will now be described by reference to FIG. 2.

After insertion of a disc CD including the document database BD in question into a disc drive LC of a workstation 1 normally provided with resident operating software LE, and starting of this software, the application generator LD and the resident operating software LE cooperate to run the document database. The user preferably has a window-mode presentation available, which is widespread on microcomputers. In a first window, he is generally invited (SP) to select one or more countries. This selection has the effect of distinguishing, in the document database BD, those groups of documents DPk corresponding to the selected country or countries. The user is offered at least two ways of searching (RD) for a document.

A first way (A) consists in carrying out a search (A1) for basic documents by consulting (A2) the table of contents TM. This has the technical effect of navigating dynamically and graphically through the tree structure until the user identifies the topic or topics which interest him. When he has selected (A3) a topic, the method according to the invention indicates the number of documents DC corresponding to this topic, offers access (A4) to these documents and supplies a summary for each of the said documents which the user can consult (A5) with a view to asking for one or more relevant documents to be displayed.

A second way (B) proposed to the user consists of a multi-criterion selection of any document from the document database BD. In a first stage (B1), the user chooses a combination of search criteria (B2) which may especially be:

the type(s) of documents,

the word(s) which are to appear in the document,

the keyword(s) attached to the document.

The method according to the invention then performs a multi-criterion search (B3) which leads to a result of the request (B4) generally in the form of a number of documents identified as satisfying the chosen combination of criteria. This result allows the user to determine whether the search has been satisfactory (B5), in which case he can consult (B6) the document or documents identified. Otherwise the user may be led either to narrow the search (B7), or to widen this search (B8).
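The multi-criterion search described above may be sketched as follows, the three criteria being combined by a logical AND. The `Document` record, the `search` function, the simple whitespace-based word matching and the sample documents are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    doc_type: str                      # "basic", "reference" or "note"
    text: str
    keywords: frozenset = frozenset()  # keywords attached to the document


def search(documents, doc_types=None, words=None, keywords=None):
    """Return the ids of the documents satisfying every criterion supplied.

    A criterion left as None is not applied, which widens the search;
    supplying an additional criterion narrows it.
    """
    results = []
    for doc in documents:
        if doc_types and doc.doc_type not in doc_types:
            continue
        text_words = set(doc.text.lower().split())
        if words and not all(word.lower() in text_words for word in words):
            continue
        if keywords and not set(keywords) <= doc.keywords:
            continue
        results.append(doc.doc_id)
    return results


# Hypothetical documents of one selected country.
documents = [
    Document("d1", "basic", "Marketing authorisation dossier requirements",
             frozenset({"registration"})),
    Document("d2", "reference", "Fee schedule for marketing authorisation",
             frozenset({"fees"})),
    Document("d3", "note", "Practical note on dossier format",
             frozenset({"registration"})),
]
wide = search(documents, words=["dossier"])
narrow = search(documents, doc_types={"basic"}, words=["dossier"])
print(len(wide), wide)
print(len(narrow), narrow)
```

The length of the returned list corresponds to the number of documents reported to the user, who can then narrow the request by adding a criterion or widen it by removing one.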

For the multi-criterion search, it is also possible to envisage a logical combination of words leading to a search of the “full text” type. A display of lists presenting the user with possible choices (“full text” index, index of keywords) facilitates the expressing of the search request.

Moreover, when a document consulted by the user includes markers for hypertext links LHi, LHj, LHk, represented by a graphics style, for example colour and/or bold characters, the user can navigate between different documents of the database following the hypertext links linking these documents and representing a logical association between them. These hypertext links result from a cross-analysis carried out by the writers of the documents, and may be of three kinds:

reference links which allow direct access to a text quoted in a source document,

structural links which connect together different parts of the same text which has been segmented for easier readability, and

comment-based links, particularly allowing access to personal notes which the user can attach to certain documents.

It is also possible to envisage hypertext links offering immediate access to a definition of an expression, or to a meaning of an acronym via a direct call to a glossary incorporated in the document database.
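The three kinds of hypertext links may be represented, by way of example, as typed links between document identifiers. The `LinkKind` enumeration, the `Link` record, the `follow` function and the identifiers are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    REFERENCE = "reference"    # direct access to a text quoted in a source document
    STRUCTURAL = "structural"  # connects segmented parts of the same text
    COMMENT = "comment"        # access to a personal note attached by the user


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: LinkKind


def follow(links, source, kind=None):
    """Return the targets reachable from `source`, optionally for one kind of link."""
    return [
        link.target
        for link in links
        if link.source == source and (kind is None or link.kind is kind)
    ]


links = [
    Link("doc-A/part-1", "doc-A/part-2", LinkKind.STRUCTURAL),
    Link("doc-A/part-1", "doc-B", LinkKind.REFERENCE),
    Link("doc-A/part-1", "note-1", LinkKind.COMMENT),
]
all_targets = follow(links, "doc-A/part-1")
quoted_texts = follow(links, "doc-A/part-1", LinkKind.REFERENCE)
print(all_targets)
print(quoted_texts)
```

A glossary link of the kind envisaged above could be added as a further member of the enumeration without changing the navigation function.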

Another search mode offered by the method according to the invention consists in navigating interactively through the hierarchical classification scheme PH of the body of regulations CR1, CR2, . . . , CRM of one or more selected countries P1, P2, . . . , PM, in selecting a document and in displaying it in order to consult it.

The processing and access method according to the invention offers the manager of the document database great flexibility in updating that database. This is because, as and when new documents are received from the experts writing them, these documents can be processed continuously and incorporated into the geographical groups to which they correspond. The new documents are then substituted for the now obsolete or superseded documents, are categorised and incorporated in the appropriate topic-based levels within the tree structure. New editions of constantly updated discs are produced and distributed periodically.
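The updating just described may be sketched as follows, a new document being incorporated into its geographical group and substituted for the documents it supersedes. The dictionary layout, the `supersedes` field and the `incorporate` function are assumptions made for this illustration.

```python
def incorporate(database, new_document):
    """Add `new_document` to its geographical group and remove what it supersedes.

    `database` maps each geographical area to a dict of document id -> record.
    """
    group = database.setdefault(new_document["area"], {})
    for old_id in new_document.get("supersedes", ()):
        group.pop(old_id, None)  # an id that is already absent is ignored
    group[new_document["doc_id"]] = new_document
    return database


# Hypothetical database holding one document for one country.
database = {
    "FR": {
        "fr-001": {"doc_id": "fr-001", "area": "FR", "topic": ("Registration", "Fees")},
    },
}
update = {
    "doc_id": "fr-002",
    "area": "FR",
    "topic": ("Registration", "Fees"),
    "supersedes": ["fr-001"],
}
incorporate(database, update)
incorporate(database, {"doc_id": "jp-001", "area": "JP", "topic": ("Labelling",)})
print(sorted(database["FR"]), sorted(database))
```

Since each update touches only the group of the area concerned, the other geographical groups are left unchanged between two editions of the discs.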

Clearly, the invention is not limited to the examples which have just been described, and many other configurations can be applied to these examples without departing from the scope of the invention. Thus, the numbers of geographical areas, of types of documents and of topics of the table of contents may be chosen on the basis of the users' requirements and of market changes. The permanent information media may be CD-ROM discs, as has just been described, but also CD-I (Interactive Compact Disc) discs or any other permanent information medium, whether it can be rewritten or not.

What is claimed is:
 1. A method for processing and accessing information objects, comprising: on at least one central site: stages of collecting objects, stages of classifying the objects collected, and stages of generating a structured database containing the objects classified, characterized in that each classification stage comprises: a grouping of the collected objects together into groups of objects associated respectively with given geographical areas, a categorization of each object into several types of objects, a first classification, in each group of objects, of the objects belonging to at least one of the types of objects, according to a table of contents including a set of topics and common to all the groups of objects, said table of contents being associated with a topic-based classification scheme, a second classification, in each group of objects, of some of the collected objects constituting a particular group of objects, according to a hierarchical classification scheme common to all the groups of objects, and in that it further comprises, within the central site, a stage of selecting key elements within each collected object, said key elements being grouped together within a thesaurus, and in that the accessing stages comprise stages of selection according to at least one criterion including a criterion of selection by key element.
 2. The method according to claim 1, further comprising on at least one production site, stages of reproducing the structured database generated on permanent information media, and, on several local consultation sites, stages of accessing one or more objects within the structured database, said accessing stages comprising stages of searching for the objects on a permanent information medium, characterized in that each stage of accessing an object further comprises a stage of selecting at least one geographical area followed by a stage of navigating within the selected geographical areas, said navigation stage being capable of covering either said topic-based classification scheme or said hierarchical classification scheme.
 3. The method of claim 2 for processing and accessing documents, characterized in that the collected documents are categorized by type into basic documents, reference documents and notes.
 4. The method of claim 1 for processing and accessing documents, characterized in that the collected documents are categorized by type into basic documents, reference documents and notes.
 5. A system for processing and accessing objects, comprising: on at least one central site, means for collecting objects, means for classifying the objects collected, and means for generating a structured database containing the objects classified, on at least one production site, means for reproducing the structured database generated on permanent information media, and, on several local consultation sites, means for accessing one or more objects within the structured database, said accessing means comprising means for searching for the objects on a permanent information medium, characterized in that the classification means comprise: means for grouping the collected objects together into groups of objects associated respectively with given geographical areas, means for categorizing each object into several types of objects, means for carrying out, in each group of objects, a first classification of the objects belonging to at least one of the types of objects, according to a table of contents including a set of topics and common to all the groups of objects, said table of contents being associated with a topic-based classification scheme, means for carrying out, in each group of objects, a second classification of some of the collected objects constituting a particular group of objects, according to a hierarchical classification scheme common to all the groups of objects, means for selecting key elements within each collected object and means for grouping said key elements together within a thesaurus.
 6. The system according to claim 5, characterized in that said means of accessing an object further comprise means for selecting an object according to at least one criterion including a criterion of selection by key element.
 7. The system according to claim 6, characterized in that said means for accessing an object further comprise means for selecting at least one geographical area and means for navigating within the selected geographical areas, said navigation means being configured so as to cover either said topic-based classification scheme or said hierarchical classification scheme.
 8. The system according to claim 5 characterized in that said means for accessing an object further comprise means for selecting at least one geographical area and means for navigating within the selected geographical areas, said navigation means being configured so as to cover either said topic-based classification scheme or said hierarchical classification scheme.